ABSTRACT


Inference planning techniques have been implemented and incorporated
within a prototype deductive processor designed to support the
extraction of information implied by, but not explicitly included in,
the contents of a relationally structured data base. Deductive
pathfinding and inference planning are used to select small sets of
relevant premises and to construct skeletal derivations. When these
"skeletons" are verified, the system uses them as plans to create
data-base access strategies that guide the retrieval of data values,
to assemble answers to user requests, and to produce proofs
supporting those answers. Several examples are presented to
illustrate the current capability of the prototype Deductively
Augmented Data Management (DADM) system.


INTRODUCTION 


Not only are computerized data bases growing in size, number, and
complexity, but the number of on-line users is also growing rapidly.
The availability of larger and cheaper memories is making it feasible
to store vast quantities of data on-line, but this often serves only
to increase the frustration of users, who, because of limitations in
current data-base retrieval technology, are unable to take full
advantage of the information. A major deficiency in present data-base
systems is an inability to discover (at the direction of users)
implicit relationships among the data items explicitly present.
Deductive logic offers considerable potential for improving on-line
access to large, complex data-base domains. The prototype Deductively
Augmented Data Management (DADM) system described in this paper has
been designed to:


se  complex  and  subtle  queries  to  the 


In particular, user-system interactive techniques have been developed
whereby the system creates and displays inference plans and chains of
evidence as an integral part of the question-answering process. The
user actively participates by supplying advice, refining his queries,
and requesting additional plans and evidence as necessary. This
interactive cycle continues until the user is satisfied with the
quality as well as the quantity of the derived information. Sometimes
this entails the provision of evidence both for and against a user's
conjecture or working hypothesis. Sometimes the system provides a
user with a conditional (yes if...) answer rather than a strictly
categorical answer. In all cases, the system permits a user to ask
for corroborative evidence by requesting alternative derivations for
an answer. (Multiple evidence chains may often reinforce the user's
confidence in the value of the information received.)


APPROACH 

The design for the deductive processor described in this paper
evolved out of research on an English question-answering system
called CONVERSE (Kellogg et al. [1971] and Travis et al. [1973]).
This system consisted of a language processor (driven by English
syntax rules and a semantic network) and a relational data management
system that accessed specific facts realized as N-tuple members of
predicate (relation) extensions. When analyzing a query such as "Who
is mayor of Denver?", the system would use its semantic network to
infer that the reference was to the City of Denver, not the County of
Denver. The inference was based on the general proposition,
represented in the semantic network, that the range of the relation
being mayor of includes cities but not counties.


* It is important to note that while the deductive processor will be
applying rules of strict logical reasoning, the information (the set
of general assertions or premises) that is being used to construct
evidence chains may range in degree of plausibility from "hard"
(strictly true) to "soft" (possibly the case).
Further, in analyzing more complex queries such as "What cities are
in states with a population less than that of the City of Boston?",
the system would infer which states possess the property of having a
population smaller than that of Boston, an ad hoc property not
directly available in the network or data base. While useful, these
kinds of inferences are special purpose and limited. We decided that
a more general-purpose inferential capability needed to be designed
and added to the system for use in many different contexts and for
many different purposes (Klahr [1975], Kellogg et al. [1976], Klahr
[1978], and Kellogg et al. [1977]).

Two design criteria were crucial in the development of the deductive
processor (DP). The first criterion was that the DP would be an
independent system yet capable of being "added on" to existing and
emerging relational data management systems (RDMSs). This led to a
distinct separation between a store of extensional data (specific
facts) and a store of intensional data (general statements, premises,
rules). The former is accessed by an RDMS, while the latter is
accessed by the DP (see Figure 1). (This separation of data is also
suggested in a recent proposal by Reiter [1978].) No change is
necessary to the RDMS to add on the DP. This same criterion of an
RDMS add-on also led to a focus on deduction by exception: user
queries not requiring deduction should be identified as such and sent
directly to the RDMS.

The second criterion focused on the selection of relevant premises.
Premises, or inference rules, are general statements that can be used
in making deductions. Given a large number of such premises, a
crucial problem arises in controlling the deductive search space. An
inference planning process has been designed and implemented to
locate potentially relevant premises. This process must be fast and
efficient to compensate for the overhead processing involved. But
such planning is needed in order to give the system guidance in its
deductive searching. Furthermore, the planning process is used to
guide and direct relational data-base searching by specifying what
facts are needed to support the deductions and proofs found to answer
user queries.


ABSTRACTING AND SEMANTICALLY RESTRICTING DEDUCTIVE INTERACTIONS

Processes of abstraction (of deductive interactions) and restriction
(of semantic scope) are central to our approach to relevant premise
selection. Where possible these abstraction and restriction processes
are carried out during premise input in order to minimize processing
time during query analysis and deductive question-answering.
Premises and queries are entered into the system as primitive
conditional statements (Travis et al. [1973]).* A primitive
conditional is a first-order predicate-calculus normal form whose
central connective is the implication sign. The antecedent of the
implication contains the assumptions of the premise/query and the
consequent contains the goals of the premise/query. Assumptions and
goals are literals, that is, atomic predicate occurrences or negated
atomic predicate occurrences. Within a given antecedent or
consequent, literals may be combined either conjunctively or
disjunctively. Each predicate occurrence is an instance of a
predicate (relation) along with its argument terms (namely variables,
constants, or functions). Primitive conditionals are used because
they support the introduction of general assertions in a natural way,
similar to the way production rules are used in knowledge-based
systems; see Davis and King [1975].
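
The structure of a primitive conditional can be made concrete with a
small sketch. The Python below is illustrative only and is not the
prototype's representation: the class names, the field names, and the
convention that lowercase argument strings denote variables are all
assumptions of the sketch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Literal:
    """An atomic predicate occurrence, possibly negated."""
    predicate: str
    args: Tuple[str, ...]      # sketch convention: lowercase = variable
    negated: bool = False


@dataclass(frozen=True)
class PrimitiveConditional:
    """Antecedent literals (assumptions) imply consequent literals (goals)."""
    assumptions: Tuple[Literal, ...]
    goals: Tuple[Literal, ...]
    antecedent_connective: str = "AND"   # "AND" or "OR"
    consequent_connective: str = "AND"


# A hypothetical premise: Located-in(x,y) & Located-in(y,z) implies Located-in(x,z)
transitivity = PrimitiveConditional(
    assumptions=(Literal("Located-in", ("x", "y")),
                 Literal("Located-in", ("y", "z"))),
    goals=(Literal("Located-in", ("x", "z")),),
)

# A hypothetical query with an empty antecedent and a single goal.
query = PrimitiveConditional(
    assumptions=(),
    goals=(Literal("Located-in", ("x", "DENVER")),),
)
```

A query and a premise share one shape here, which mirrors the text's
statement that both are entered in the same normal form.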

Several kinds of information are abstracted from the premises during
input and used to create a predicate connection graph (PCG),** as
well as other storage structures that promote efficient association
of deductive and semantic information (Klahr [1975]).

A premise is first converted into a Skolemized, quantifier-free form.
The implication (as well as other truth-functional) connections among
the predicate occurrences in a premise are encoded into the PCG as a
series of deductive dependency links. Further, the deductive
interactions (or unifications; see Robinson [1965]) between predicate
occurrences in the new premise and predicate occurrences in existing
premises are pre-computed and encoded into the PCG as a series of
interpremise associative arcs. The variable substitutions required
for unification are stored elsewhere, for later use in verifying
skeletal derivations (i.e., inference or proof plans).

Semantically restrictive information is introduced in several
different forms in order to restrict the logically possible
unifications to those that are semantically meaningful for particular
application domains.

The variables and constants occurring in premises can be "typed",
that is, assigned to specific domain classes. For example, the
variable "X" might be assigned the type DOCUMENT, and the constant
"Sam" assigned the type SCIENTIST. Then, whenever "X" and "Sam" occur
in the same argument position of different instances of a relation,
those relation instances will not unify, and they will not be
connected in the PCG, due to their semantically incompatible types.

* In an operational system, premises would normally be entered by
the data-base administrator.

** See Kowalski [1975] and Sickel [1976] for the use of connection
graphs in theorem proving.

Compound types, consisting of set union, intersection, and difference
operations over simple types, may also be used to specify more
complex semantic restrictions on predicate domains. A semantic
network is used to represent set relationships between types.* Class
inclusion paths within this network are used, for example, to permit
unification of instances of type SCIENTIST with instances of type
MAMMAL. As new premises are entered into the system, this semantic
network is automatically updated to reflect new predicate-domain
associations.
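
The effect of typing on unification can be sketched as follows. This
is a minimal illustration, not the DADM implementation: the
child-to-parent table, the intermediate type PERSON, the type OBJECT,
and the rule that two types are compatible when one includes the
other are assumptions made for the example.

```python
# Hypothetical class-inclusion links of a semantic network (child -> parent).
SUPERTYPE = {"SCIENTIST": "PERSON", "PERSON": "MAMMAL", "DOCUMENT": "OBJECT"}


def ancestors(type_name):
    """Return the type itself followed by every type that includes it."""
    chain = [type_name]
    while type_name in SUPERTYPE:
        type_name = SUPERTYPE[type_name]
        chain.append(type_name)
    return chain


def types_compatible(a, b):
    """Assumed rule: compatible when a class-inclusion path joins the two."""
    return a in ancestors(b) or b in ancestors(a)


def may_unify(args1, args2, type_of):
    """Two occurrences of one relation may unify only if the types of the
    terms in each argument position are compatible."""
    return len(args1) == len(args2) and all(
        types_compatible(type_of[x], type_of[y]) for x, y in zip(args1, args2)
    )


type_of = {"X": "DOCUMENT", "Sam": "SCIENTIST", "M": "MAMMAL"}
print(may_unify(("X",), ("Sam",), type_of))   # False: no inclusion path
print(may_unify(("M",), ("Sam",), type_of))   # True: SCIENTIST is within MAMMAL
```

Pairs rejected by such a test would simply never be entered as arcs
in the PCG, so the cost is paid once, at premise input.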

In addition to this use of semantic information to restrict
unification by means of types, unification between multiple
occurrences of a predicate within the same premise may sometimes be
avoided by restating the premise's assertion by use of logical
properties. For example, the predicate "North-of" could be
characterized by the premises:

∀x ∀y ∀z (North-of(x,y) & North-of(y,z) ⊃ North-of(x,z))

∀x ∀y (North-of(x,y) ⊃ ¬North-of(y,x))

∀x (¬North-of(x,x))

The first premise specifies that North-of is transitive. This premise
is recursive and can deductively interact with itself and the other
premises to cause a rapid expansion of the deductive search space. To
help avoid this problem, the DADM system permits binary predicates to
be characterized by their logical properties (for example, North-of
would be assigned the logical properties transitive, asymmetric, and
irreflexive). Computational procedures can then be called to effect
special-purpose inferences associated with various groupings of
logical properties. Recursive premises describing logical properties
of predicates are therefore replaced, where possible, by
special-purpose subroutines. Subroutines are being implemented for
consistent combinations of the logical properties identified by
Elliott [1965].** Future effort will involve other properties such as
a relation being hereditary with respect to another relation, e.g.,
P being hereditary over R in

∀x ∀y (P(x) & R(x,y) ⊃ P(y))


* See McSkimin and Minker [1977, 1978] for related research on
introducing semantic information into a deductive system.

** Properties and examples are: reflexive (equal-to), irreflexive
(greater-than), symmetric (equal-to), asymmetric (North-of),
transitive (located-in), 1-leader (mother-of), 1-follower (weighs),
noregrowth (son-of), and unlooped (mother-of).
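
The replacement of a recursive transitivity premise by a
special-purpose subroutine might look like the sketch below. It is an
assumed illustration rather than one of the DADM subroutines: the
function name, the representation of stored facts as pairs, and the
place names A, B, C are hypothetical.

```python
from collections import deque


def holds_transitively(facts, a, b):
    """Decide R(a, b) for a transitive relation R by a graph search over the
    stored pairs, instead of chaining through a recursive premise."""
    successors = {}
    for x, y in facts:
        successors.setdefault(x, set()).add(y)
    frontier, seen = deque([a]), {a}
    while frontier:
        node = frontier.popleft()
        for nxt in successors.get(node, ()):
            if nxt == b:
                return True
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


# Hypothetical stored facts: A is North-of B, and B is North-of C.
north_of = {("A", "B"), ("B", "C")}
print(holds_transitively(north_of, "A", "C"))   # True, by transitivity
print(holds_transitively(north_of, "C", "A"))   # False
```

Because each stored pair is visited at most once, the search is
bounded by the number of facts, whereas the recursive premise can
unify with itself indefinitely.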
Logical properties of binary relations are identified by a
user-system dialog that is initiated, as shown below, for the
predicate "North-of" (user input is preceded by an asterisk):

* Define  (North-of) 

Suppose one thing is North-of a second thing that in turn
is North-of a third thing. Is the first thing North-of
the third?

* Yes 

If  one  thing  is  North-of  a second  thing,  will  it  always  be 
the  case  that  the  second  is  North-of  the  first? 

* No 

Might  it  ever  be  the  case? 

* No 

After the third yes/no response, the system is able to identify
"North-of" as a transitive, asymmetric, irreflexive, and unlooped
relation.
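
How three yes/no responses can determine four properties may be
sketched as below. The function and parameter names are hypothetical,
and the reading of "unlooped" as "contains no cycles" is an
assumption of the sketch, not a definition taken from Elliott [1965].

```python
def classify(transitive, always_symmetric, ever_symmetric=None):
    """Map the dialog's yes/no answers to a set of logical properties.

    transitive       -- answer to the first question
    always_symmetric -- answer to "will it always be the case ..."
    ever_symmetric   -- answer to "might it ever be the case?" (asked only
                        when always_symmetric is False)
    """
    properties = set()
    if transitive:
        properties.add("transitive")
    if always_symmetric:
        properties.add("symmetric")
    elif ever_symmetric is False:
        properties.add("asymmetric")
        # R(x,x) would violate asymmetry, so the relation is irreflexive.
        properties.add("irreflexive")
        if transitive:
            # A cycle would yield R(x,x) by transitivity, so none exists.
            properties.add("unlooped")
    return properties


# The answers given for North-of in the dialog: Yes, No, No.
print(sorted(classify(True, False, False)))
# ['asymmetric', 'irreflexive', 'transitive', 'unlooped']
```

Only the first and third properties are asked about directly; the
other two follow by elementary reasoning, which is why the dialog can
stop after three questions.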

Variable typing reduces the number of unifications in the PCG by
making use of semantic domain restrictions. Logical properties
replace some kinds of recursive premises, and their often troublesome
unifications, with special-purpose inferencing procedures. A third
form of semantic restriction used in the DADM system does not
directly eliminate unifications in the PCG, but does limit the
selection and use of premises and predicates by means of advice
supplied by a data-base administrator or user during query
processing.


A data-base administrator enters semantic advice in the form of
"Conditions → Recommendations" rules. For example, one could advise
that a ship return to its home port if it is damaged by specifying:

(Assumption Damaged(Ship)) → Returns(Ship Ports)

The  system  would  try  using  premises  containing  the  Returns  relation 
when  the  Damaged  relation  occurs  as  an  assumption.  Advice  rules 
are  stored  in  an  advice  file,  where  they  are  automatically  selected 
and  applied  whenever  their  condition  part  holds  for  input  queries. 
In  addition  to  such  advice  rules,  the  user  could  supply  advice  for 
a particular  query  by  stating  only  the  advised  recommendation  for 
that  query. 
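
Selection of advice rules from an advice file can be sketched as
follows. The dictionary layout, the field names, and the function
name are assumptions made for illustration; only the Damaged/Returns
rule itself comes from the text.

```python
# Hypothetical in-memory advice file: each rule names the role and predicate
# of its condition part and the predicate it recommends.
ADVICE_FILE = [
    {"role": "assumption", "predicate": "Damaged", "recommend": "Returns"},
]


def advised_predicates(query_assumptions, query_goals, advice_file=ADVICE_FILE):
    """Return the recommendations of every rule whose condition holds for
    the predicates occurring in the query."""
    present = {"assumption": set(query_assumptions), "goal": set(query_goals)}
    return [rule["recommend"] for rule in advice_file
            if rule["predicate"] in present[rule["role"]]]


print(advised_predicates(["Damaged", "Destination"], ["Transport"]))  # ['Returns']
print(advised_predicates([], ["Transport"]))                          # []
```

Advice given by the user for a single query would bypass the
condition test and contribute its recommendation directly.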


Advice most typically involves recommendations on the use of
particular premises or predicates in finding deductions. For advised
premises, the system will try using them whenever possible in the
course of constructing a proof. For advised predicates, the system
will try chaining through occurrences of them in premises. In the
case of negative advice, specified premises and predicates are
avoided in proof construction.


INFERENCE  PLANNING  AND  DEDUCTIVE  QUESTION  ANSWERING 

The development, refinement, and execution of inference plans
proceeds through a series of phases. These phases are designed to
progressively apply a series of increasingly more stringent
deductive, semantic, and pragmatic constraints until a user receives
his desired information or is convinced that he has explored all
reasonable deductive pathways into the data base. These phases are
described below.




Deductive  Pathfinding 


Symbolic queries (in the form of primitive conditionals) are
decomposed into a set of assumptions (antecedents of the conditional)
and a set of goals (consequents of the conditional). Deductive
pathfinding employs a process of middle-term chaining (Klahr [1978])
to be illustrated later. This process uses the predicate connection
graph to find chains of middle-term predicates needed to deductively
connect assumptions to goals. Middle-term chaining combines the
processes of forward chaining from the assumptions in a query and
backward chaining from the goals in a query. When a query contains no
assumptions, and the system cannot discover plausible ones to use
(say, as a result of semantic advice), middle-term chaining defaults
to backward chaining. As chaining proceeds, a series of expanding
deductive-interaction "wave fronts" are generated from assumptions
toward goals and from goals toward assumptions. Intersections are
performed on the wave fronts until a non-empty intersection occurs,
at which time the system has found an implication chain from an
assumption to a goal. Several such implication chains are usually
found (shortest chains first) before a user-controlled limit is
reached. Middle-term chaining is further constrained by the use of
semantic advice and plausibility measures. The plausibility measures
are assigned to premises and are used to order the predicate
occurrences comprising middle-term chain wave fronts to ensure that
the deductive paths involving the most plausible premises are
selected first. In a similar fashion, semantic advice obtained from
the advice file or from the user is transformed into premise and
predicate alert lists that are used to ensure that advised premises
and predicates are given priority or avoided, depending upon whether
the advice is positive or negative.
The same assumptions may be used to find deductive support for
different goals (and subgoals). When assumptions are not supplied in
a query, useful assumptions may sometimes be found by following
semantic network predicate-domain connections, or by using advised
predicates as possible assumptions.
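
The wave-front intersection at the heart of middle-term chaining can
be sketched as a bidirectional breadth-first search. This is a
simplified illustration under stated assumptions: the PCG is reduced
to a table from each predicate to the predicates derivable from it in
one step, plausibility ordering and advice are omitted, the default
to pure backward chaining is not shown, and the edges in the example
are hypothetical.

```python
def _expand(front, edges, parent):
    """Advance one wave front by a single step, recording how each newly
    reached predicate was reached."""
    new_front = set()
    for node in front:
        for nxt in edges.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                new_front.add(nxt)
    return new_front


def middle_term_chain(implies, assumptions, goals, max_depth=10):
    """Return a chain of predicates from an assumption to a goal, or None."""
    implied_by = {}
    for p, successors in implies.items():
        for q in successors:
            implied_by.setdefault(q, set()).add(p)
    fwd_parent = {a: None for a in assumptions}   # reached from assumptions
    bwd_parent = {g: None for g in goals}         # reached from goals
    fwd_front, bwd_front = set(assumptions), set(goals)
    for _ in range(max_depth):
        meeting = fwd_parent.keys() & bwd_parent.keys()
        if meeting:                                # non-empty intersection
            mid = sorted(meeting)[0]
            left, node = [], mid
            while node is not None:                # walk back to an assumption
                left.append(node)
                node = fwd_parent[node]
            right, node = [], bwd_parent[mid]
            while node is not None:                # walk on to a goal
                right.append(node)
                node = bwd_parent[node]
            return left[::-1] + right
        if not fwd_front and not bwd_front:
            return None                            # both fronts exhausted
        fwd_front = _expand(fwd_front, implies, fwd_parent)
        bwd_front = _expand(bwd_front, implied_by, bwd_parent)
    return None


# Hypothetical one-step deductive connections between predicates.
implies = {"DAMAGED": {"RETURNS"}, "RETURNS": {"OFFLOADS"},
           "OFFLOADS": {"TRANSPORT"}, "DESTINATION": {"ROUTE"}}
print(middle_term_chain(implies, ["DAMAGED"], ["TRANSPORT"]))
# ['DAMAGED', 'RETURNS', 'OFFLOADS', 'TRANSPORT']
```

Expanding from both ends keeps each front small: two fronts of depth
d/2 generally touch far fewer predicates than one front of depth d.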

Plan  Generation 


For each middle-term chain generated, the system extracts the
premises whose occurrences are part of the chain. Subgoals resulting
from the premises are set up to be resolved either by deductive
support through the premises, by data-base search through the
relational file, or by procedural computation. Subgoals are added to
a proof-proposal tree, which contains the inference plans being
formed and developed. Once inference plans have no remaining
deductive subgoals, they are available for verification, user review,
and instantiation.

Plan  Verification 

Skeletal plans constructed during plan generation are valid proofs at
the truth-functional level. In plan verification, the variable
substitutions associated with the unifications in each plan are
examined for consistency. If there are no clashes (that is, if no
variables are assigned more than one distinct constant value), then
verification is successful and instantiation by data-base search may
follow. During this stage, classes of variables that must take on the
same value are constructed and used to reformulate skeletal
derivations into search-compute plan components (i.e., data-base
access strategies) and inference plan components (comprising deduced
goals, deduced subgoals, and assumptions).
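
The consistency check and the construction of variable classes can be
sketched with a union-find structure. This is an assumed technique
chosen for illustration, not necessarily the one used in the
prototype; the function names, the pairwise form of the
substitutions, and the example terms are hypothetical.

```python
def verify_plan(bindings, is_constant):
    """bindings: pairs of terms that the plan's unifications require to be
    equal. Returns the classes of terms that must share a value, or None
    if some class contains two distinct constants (a clash)."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]   # path halving
            term = parent[term]
        return term

    for a, b in bindings:
        parent[find(a)] = find(b)                 # merge the two classes

    classes = {}
    for term in list(parent):
        classes.setdefault(find(term), set()).add(term)
    for members in classes.values():
        if sum(1 for term in members if is_constant(term)) > 1:
            return None                           # clash between constants
    return list(classes.values())


CONSTANTS = {"Kittyhawk", "Forrestal"}            # hypothetical constants
ok = verify_plan([("x", "u"), ("u", "Kittyhawk"), ("y", "v")],
                 CONSTANTS.__contains__)
clash = verify_plan([("x", "u"), ("u", "Kittyhawk"), ("x", "Forrestal")],
                    CONSTANTS.__contains__)
print(sorted(map(sorted, ok)))   # [['Kittyhawk', 'u', 'x'], ['v', 'y']]
print(clash)                     # None
```

Each surviving class then corresponds to one column of values to be
retrieved or computed when the plan is instantiated.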

Plan  Review,  Plan  Selection,  and  Query  Refinement 

Though on-line interaction may be initiated by the user or prompted
by the system at various points during pathfinding and plan
generation, most user review and interaction occurs after plan
verification. Verified plans are usually reviewed in the order in
which they were generated. (Recall that plans using the shortest
paths, most plausible premises, and advised premises and predicates
are generated first.)

During review, a user may reject a plan, instantiate it (by
requesting data-base search), or suspend further action on it until
other plans have been reviewed. In this manner, the user can minimize
unnecessary data-base searching by reviewing the derived plan
information and reaching conclusions about the likely data-base
searching consequences of his original request. Plan review may, for
example, indicate that additional assumptions, goals, or advice
should be associated with the original request, or that the original
query should be refined or replaced by a more specific (or general)
request. Considerable insight into interpreting complex requests with
respect to large data bases can be achieved, short of actually
searching the data base, by this process.

Data-Base  Search  and  Answer  Generation 

An inference plan constitutes a complete proof just in case no
search/compute plan is produced (i.e., all subgoals are deduced from
premises). More typically, one or more subgoals require data-base
and/or procedural (compute) support. Search/compute plans are
executed, in general, in three phases: first, all computable
functions and predicates having only constants as arguments are
evaluated; second, a sequence of relational search requests is
executed against the data base; third, remaining computable functions
and predicates are applied to the results of data-base search.
Answers are extracted from the N-tuples of data values associated
with search/compute plan variables. (Each of these N-tuples supplies
instantiation values that may be used to convert the original
inference plan into a complete proof or "chain of evidence".) An
answer may be categorical (for example, "yes" if no variables occur
in the original request, and data-base search is satisfied),
descriptive (a set of search-derived query-variable values displayed
in tabular format), or conditional ("yes if..." the specified
predicate-argument conditions can be verified by the user to hold
true for the application domain).
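
The three execution phases can be sketched as below. Everything in
this example is hypothetical and chosen only to show the ordering of
the phases: the relation DISTANCE, the ship and port names, the
distance values, and the convention that lowercase plan arguments are
variables are assumptions of the sketch, not data from the prototype.

```python
def is_var(term):
    return term.islower()              # sketch convention: lowercase = variable


def match(args, fact, row):
    """Extend the bindings in `row` so that args matches fact, else None."""
    row = dict(row)
    for arg, value in zip(args, fact):
        if is_var(arg):
            if row.setdefault(arg, value) != value:
                return None            # variable already bound to another value
        elif arg != value:
            return None                # constant does not match
    return row


def run_plan(searches, computes, database):
    ground = [c for c in computes if not any(is_var(a) for a in c[1])]
    delayed = [c for c in computes if c not in ground]
    # Phase 1: evaluate computable predicates whose arguments are all constants.
    if not all(test(*args) for test, args in ground):
        return []
    # Phase 2: execute the relational search requests in sequence.
    rows = [{}]
    for relation, args in searches:
        extended = []
        for row in rows:
            for fact in database[relation]:
                new_row = match(args, fact, row)
                if new_row is not None:
                    extended.append(new_row)
        rows = extended
    # Phase 3: apply the remaining computations to the retrieved values.
    return [row for row in rows
            if all(test(*(row[a] if is_var(a) else a for a in args))
                   for test, args in delayed)]


database = {
    "HOME-PORT": [("SHIP-A", "PORT-1")],
    "DISTANCE": [("SHIP-A", "PORT-1", 500), ("SHIP-B", "PORT-1", 310)],
}
searches = [("HOME-PORT", ("SHIP-A", "p")),
            ("DISTANCE", ("SHIP-A", "p", "d0")),
            ("DISTANCE", ("x", "p", "d"))]
computes = [(lambda d, d0: d < d0, ("d", "d0"))]
print([row["x"] for row in run_plan(searches, computes, database)])  # ['SHIP-B']
```

Delaying the comparison until after the searches is what allows its
arguments to be filled from retrieved values; a descriptive answer is
then read off the surviving rows.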

Often these categorical, descriptive, or conditional answers will
satisfy the user's original information requirement. In other cases,
he may wish to proceed to the next (and final) step in the inference
plan development-execution-review cycle.

Answer  Explanation  and  Evidence  Review 

Just as the plan review, plan selection, and query refinement process
is designed to aid the user in understanding the full
computer-developed implications of his query, the answer explanation
and evidence review phase of processing is designed to support him in
his evaluation of computer-derived answers. In a later section,
several computer examples illustrate current proof displays. Though
this form is often sufficient to enable users to determine the
validity and/or utility of derived answers, a more interactive and
easily comprehended dialog format for evidence display is under
development. This new facility will permit a user to selectively
interrogate the system concerning particular answers, relations, and
domains. By repetitive interrogation, he may delve as deeply as he
desires into particular lines of reasoning or evidentiary support,
without resorting to the current practice of full proof display.
Inference Planning, Data-Base Semantics,
and Generalized Navigation

The relational (extensional) data base constitutes a logical model or
interpretation for many of the relations used in the premise
(intensional) file. Conversely, the intensional information
constitutes a partial but precise representation of the semantics of
the extensional data base. Inference planning uses this intensional
information to develop both the semantic implications of user-request
assumptions and the semantic antecedents of user-request goals.
Therefore, inference planning may be used to support generalized
navigation or browsing operations through the semantics of a data
base. Generalized navigation is further supported by allowing users
to enter requests containing unrestricted relations (i.e., relations
with no arguments). Given queries of this sort, the system can
quickly find deductive paths through system restricted concepts
supporting goal relations and concepts linking assumptions to goals.
This system feature has proved most useful as a tool for exploring
the interrelationships between intensional concepts.


DEDUCTIVE  PROCESSOR  COMPONENTS 

Figure 2 shows the components of our DADM system prototype. At
present, users communicate directly with the control processor; a
language processor will be incorporated at a later date. The control
processor accepts premises and queries in primitive conditional form
as well as user advice and commands. It accesses and coordinates the
use of the several system components briefly described below.


Array  Initialization  and  Maintenance 

Information abstracted from the premises is segmented into seven
internal arrays. This segmentation contributes to system
modularization and increases processing efficiency. The seven arrays
are:

(1) Premise Array. Each entry represents a premise and contains a
list of the predicate occurrences in the premise, the plausibility of
the premise, and the premise itself (both symbolic and English) for
purposes of display.

(2) Predicate Array. The predicate array contains the relations known
to the system as well as the support indicator associated with each
relation, which indicates how to resolve each relation when it occurs
as a subgoal (deduce, search data base, compute).


Figure  2.  Deductive  Processor  Components 
(3) Predicate Occurrence Array. Each entry represents a predicate
occurrence and contains the following information about the
occurrence: its predicate name (index into predicate array), the sign
of the occurrence (positive or negative), whether the occurrence is
in the antecedent or consequent of the implication, the main
connective (conjunction or disjunction) governing the occurrence, and
the numerical position of the occurrence within its premise. The
information is compactly stored in a single-word bit vector to save
storage space.

(4) Argument Array. The argument strings of the predicate occurrences
are stored in the argument array in one-to-one correspondence to the
positions of the occurrences in the predicate occurrence array.

(5) Link Array. Truth-functional dependencies within premises are
stored in the link array. These dependencies can be implicational,
disjunctive, or conjunctive. For each predicate occurrence, a list of
the occurrences with which it is truth-functionally connected is
entered into the array.

(6)  Unifications  Array.  Each  entry  contains  a list  of  the 
unifications  (deductive  interactions)  associated  with  the  given 
occurrence.  The  unifications  array  and  the  links  array  comprise 
the  predicate  connection  graph. 

(7) Variable-Substitutions Array. The substitution lists associated
with unifications are stored in one-to-one correspondence with the
positions of the unifications in the unifications array.
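
The single-word bit vector described for the predicate occurrence
array (item 3 above) can be sketched as follows. The field order and
widths (12 bits for the predicate index, 6 bits for the position) are
assumptions chosen for the example; the paper does not give the
actual layout.

```python
POSITION_BITS = 6       # hypothetical width: position within the premise
PREDICATE_BITS = 12     # hypothetical width: index into the predicate array


def pack_occurrence(predicate_index, negative, in_consequent, disjunctive, position):
    """Pack the five fields of a predicate occurrence into one integer word."""
    if not 0 <= predicate_index < (1 << PREDICATE_BITS):
        raise ValueError("predicate index out of range")
    if not 0 <= position < (1 << POSITION_BITS):
        raise ValueError("position out of range")
    word = predicate_index
    word = (word << 1) | int(negative)        # sign of the occurrence
    word = (word << 1) | int(in_consequent)   # antecedent (0) or consequent (1)
    word = (word << 1) | int(disjunctive)     # conjunction (0) or disjunction (1)
    word = (word << POSITION_BITS) | position
    return word


def unpack_occurrence(word):
    """Recover the five fields from a packed word."""
    position = word & ((1 << POSITION_BITS) - 1)
    word >>= POSITION_BITS
    disjunctive = bool(word & 1)
    in_consequent = bool((word >> 1) & 1)
    negative = bool((word >> 2) & 1)
    predicate_index = word >> 3
    return predicate_index, negative, in_consequent, disjunctive, position


word = pack_occurrence(29, False, True, False, 3)
print(unpack_occurrence(word))   # (29, False, True, False, 3)
```

With these widths an occurrence fits in 21 bits, comfortably within a
single machine word.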


Chain  Generator,  Plan  Generator,  and  Plan  Verifier 

The Chain Generator, Plan Generator, and Plan Verifier support the
deductive pathfinding, plan generation, and verification processes.
They communicate with one another by means of the control processor
and with the user by means of the display processor.

Display  Processor 

Plan and proof (evidence) review and query refinement processes are
supported by the Display Processor. The user can, for example,
examine middle-term chains generated, plans formed, subgoals,
verified plans, data-base search requests, data-base values returned,
answers, completed proofs, and premises used in proofs.
DEDUCTION EXAMPLES

Figures 3 and 4 illustrate the current operation of the deductive
processor (DP) prototype interfaced to a small RDMS. (Both DP and
RDMS are written in LISP 1.5 and operate on SDC's Amdahl 470/V5
computer.)

The first example illustrates the generation of short inference and
search/compute plans for the question, "What ships are closer to the
Kittyhawk's home port than the Kittyhawk is?" The query is first
shown in English and then in the primitive conditional symbolic form
that the prototype currently recognizes. The query is expressed in
terms of a conjunctive goal composed of the predicates CLOSER-THAN
and HOME-PORT. Constants (such as Kittyhawk) are specified by being
enclosed in parentheses, while variables (such as x and y) are not.
One of the query goals (HOME-PORT) is to be given data-base support;
that is, it has been defined by data-base values, while the other
goal (CLOSER-THAN) is to be deduced. Since the antecedent in the
query is empty, the system back-chains from CLOSER-THAN through
premise 29. The plausibility of the plan in this case is simply the
plausibility of the single premise used (plausibility measures are
assigned by the data-base administrator and range from 1 (very low
plausibility) to 99 (always the case)). Two new search requests (in
addition to HOME-PORT) result from premise 29, as well as a compute
relation containing functional arguments. Computations for the
functions and the relation are delayed until values for the variables
x and y (the values needed to satisfy the search requests) have been
found in the data base.

The system sends the four search requests to the RDMS, which
finds two ships, the Forrestal and the Gridley, that are closer
to the Kittyhawk's home port (San Diego) than the Kittyhawk is.
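The order of operations in this example can be sketched in a few lines. The following is only an illustrative Python sketch, not the prototype's LISP 1.5 code: the relation layout and the function names are assumptions, and the Gridley and Taurus distances are invented (only 378 and 310 appear in Figure 3). It shows the searches binding x and y first, with the GREATER-THAN computation delayed until both are bound.

```python
# Toy relational data base. The 378 and 310 distances appear in Figure 3;
# the Gridley and Taurus distances are invented for illustration.
SHIPS = {"KITTYHAWK", "FORRESTAL", "GRIDLEY", "TAURUS"}
HOME_PORT = {"KITTYHAWK": "SAN-DIEGO"}
DISTANCE = {
    ("KITTYHAWK", "SAN-DIEGO"): 378,
    ("FORRESTAL", "SAN-DIEGO"): 310,
    ("GRIDLEY", "SAN-DIEGO"): 150,   # hypothetical value
    ("TAURUS", "SAN-DIEGO"): 2400,   # hypothetical value
}


def distance_between(ship, port):
    """Stand-in for the DISTANCE-BETWEEN function (a table lookup here)."""
    return DISTANCE[(ship, port)]


def closer_than_kittyhawk():
    """Run the searches first, then the delayed GREATER-THAN test."""
    answers = []
    if "KITTYHAWK" not in SHIPS:          # SEARCH *SHIPS KITTYHAWK
        return answers
    y = HOME_PORT["KITTYHAWK"]            # SEARCH *HOME-PORT KITTYHAWK Y
    for x in sorted(SHIPS):               # SEARCH *SHIPS X
        # COMPUTE *GREATER-THAN is evaluated only now that X and Y are bound.
        if distance_between("KITTYHAWK", y) > distance_between(x, y):
            answers.append((x, y))
    return answers


print(closer_than_kittyhawk())
```

With the toy data above, the Kittyhawk itself is rejected because 378 is not greater than 378, and the invented Taurus entry is rejected because it is farther away.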

The system then displays the proof that led to the first answer
(the Forrestal). A proof using the other answer would be identical
to this one except that Gridley would replace Forrestal in the
proof, and the distance between the Gridley and San Diego would
replace 310 (the distance between the Forrestal and San Diego).

The symbols G2, G3, etc., represent nodes in the proof proposal
tree and are used here for reference. G2 and G3 represent the
original goals as also shown in the inference plan. G5, G6, and
G7 are subgoals that resulted from premise 29, which was used to
deduce G2. Thus, these three subgoals are indented below G2.

The middle-term-chaining and planning processes are more evident
in the example in Figure 4. The input query contains two
assumptions (DAMAGED and DESTINATION) and one goal (TRANSPORT).
Taurus and NY are constants; Cargo and x are variables. The query
asks the system to find values for x that satisfy the query. The
variable x is also restricted to range over ships. This is an
example of a type restriction on a variable. In the course of
*WHAT SHIPS ARE CLOSER TO THE KITTYHAWK'S HOME PORT
*THAN THE KITTYHAWK IS?

QUERY((() IMP (AND (CLOSER-THAN X (KITTYHAWK) Y)
                   (HOME-PORT (KITTYHAWK) Y))))

INFERENCE PLAN:

DEDUCE  G2  *CLOSER-THAN X KITTYHAWK Y
SEARCH  G3  *HOME-PORT KITTYHAWK Y
PREMISES USED: (29)  PLAN PLAUSIBILITY:

SEARCH/COMPUTE PLAN:

SEARCH   *SHIPS KITTYHAWK
SEARCH   *SHIPS X
SEARCH   *HOME-PORT KITTYHAWK Y
COMPUTE  *GREATER-THAN (DISTANCE-BETWEEN KITTYHAWK Y)
                       (DISTANCE-BETWEEN X Y)

ENTERING DATA BASE
DATA-BASE SEARCH SUCCESSFUL
***************

ANSWER SUMMARY -
VARIABLES:
(X Y)
ANSWERS:
(FORRESTAL SAN-DIEGO)
(GRIDLEY SAN-DIEGO)

***************

PROOF DISPLAY:

DEDUCED   G2  *CLOSER-THAN FORRESTAL KITTYHAWK SAN-DIEGO
FACT      G5  **SHIPS KITTYHAWK
FACT      G6  **SHIPS FORRESTAL
COMPUTED  G7  **GREATER-THAN 378 310
FACT      G3  *HOME-PORT KITTYHAWK SAN-DIEGO
PREMISES USED: (29)  PROOF PLAUSIBILITY: 99
TYPE PREMISE NUMBER TO DISPLAY, OR 'END':
29

((ALL X79) (ALL X80) (ALL X81)
 (AND (SHIPS X79) (SHIPS X80)
      (GREATER-THAN (DISTANCE-BETWEEN X79 X81)
                    (DISTANCE-BETWEEN X80 X81)))
 IMP (CLOSER-THAN X80 X79 X81))
PLAUSIBILITY: 99

TYPE PREMISE NUMBER TO DISPLAY, OR 'END':
END
END DISPLAY

Figure 3. Deduction Involving Deduce, Data-Base Search, and
Compute Predicates
*IF THE TAURUS WERE DAMAGED WHILE DESTINED FOR NEW
*YORK WITH A CARGO, WHAT SHIPS COULD TRANSPORT THE
*CARGO TO NEW YORK?

QUERY(((WHAT (SHIP . X))
       (AND (DAMAGED (TAURUS))
            (DESTINATION (TAURUS) (NY) CARGO))
       IMP (TRANSPORT X CARGO (NY))))

INFERENCE PLAN:

DEDUCE    G1  *TRANSPORT SHIPIX X75 NY
ASSUME        *DESTINATION TAURUS NY X75
DEDUCE    G3  **OFFLOAD TAURUS X75 X72
ASSUME        **DAMAGED TAURUS
MID-TERM      **RETURNS TAURUS X72
PREMISES USED: (23 7 15)  PLAN PLAUSIBILITY:

SEARCH/COMPUTE PLAN:

SEARCH  *HOME-PORT TAURUS X72
SEARCH  *CARRY TAURUS X75
SEARCH  *AVAILABLE SHIPIX X72

ENTERING DATA BASE
DATA-BASE SEARCH SUCCESSFUL
***************

ANSWER SUMMARY -
VARIABLES:
(X)
ANSWERS:
(PISCES)
(GEMINI)

***************

PROOF DISPLAY:

DEDUCED   G1   *TRANSPORT PISCES OIL NY
ASSUME         *DESTINATION TAURUS NY OIL
DEDUCED   G3   **OFFLOAD TAURUS OIL FREEPORT
ASSUME         **DAMAGED TAURUS
MID-TERM       **RETURNS TAURUS FREEPORT
FACT      G11  ***HOME-PORT TAURUS FREEPORT
FACT      G12  ***CARRY TAURUS OIL
FACT      G4   **AVAILABLE PISCES FREEPORT
PREMISES USED: (23 7 15)  PROOF PLAUSIBILITY:
END DISPLAY

Figure 4. Deduction Using Middle-Term Chaining
developing deductions, the system will not allow values that
belong to domain classes other than ships to be substituted for x.

The inference plan shown in Figure 4 has already been verified.
To see the planning mechanism more clearly, refer to Figure 5. The
first middle-term chain generated connects the DESTINATION assumption
to the TRANSPORT goal via premise 23. This is shown by the
unifications (deductive interactions) U1 and U2 in Figure 5a. The
predicate occurrences involving the relations AVAILABLE and
OFFLOAD become subproblems. The former is to be given data-base
support; the latter is deduced by a middle-term chain from the
DAMAGED assumption through premises 7 and 15. This chain is shown
in Figure 5b by the unifications U3, U4, and U5. The two new
subproblems are to be given data-base support. Thus the plan generated
uses three premises and contains three subproblems requiring
data-base search. The plausibility of the plan is calculated by a
fuzzy intersection, that is, the minimum of the plausibilities of the
premises involved (Zadeh [1965]).
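The fuzzy-intersection rule is easy to state in code. The following Python sketch is illustrative only: the function name is an assumption, and the plausibility values given for premises 23, 7, and 15 are invented, since the values shown in Figure 4 are not legible in this copy.

```python
def plan_plausibility(premise_plausibilities):
    """Fuzzy intersection: a plan is only as plausible as its weakest premise."""
    values = list(premise_plausibilities)
    if not values:
        raise ValueError("a plan must use at least one premise")
    for value in values:
        if not 1 <= value <= 99:
            raise ValueError("plausibility measures range from 1 to 99")
    return min(values)


# Hypothetical plausibilities for the premises used in Figure 4.
PLAUSIBILITY = {23: 90, 7: 99, 15: 85}

print(plan_plausibility(PLAUSIBILITY[p] for p in (23, 7, 15)))
```

For a single-premise plan such as the one in Figure 3, the minimum is just that premise's own plausibility, which agrees with the first example.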

The plan is then verified with variable substitutions inserted
in the plan and in the search requests (Figure 4). Note the
variable constraints in the search requests. The variable X72
represents the home port of Taurus; values found for this variable
must be the same as those found for X72 in the AVAILABLE search
request. Thus, those ships that are available in Taurus's home
port are the ones we are interested in. The proof display is given
for the first answer found (the Pisces).
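The effect of the shared variable X72 can be sketched as a small conjunctive search. This Python sketch is an illustration under stated assumptions, not the prototype's access strategy: variables are written with a leading "?", the helper names are invented, and the ARIES, BOSTON, and DRYDOCK-2 entries are hypothetical data added to show a value rejected by the X72 constraint and a value rejected by the ship type restriction.

```python
# Toy relations. TAURUS, FREEPORT, OIL, PISCES and GEMINI come from Figure 4;
# ARIES, BOSTON and DRYDOCK-2 are invented for illustration.
DB = {
    "HOME-PORT": {("TAURUS", "FREEPORT")},
    "CARRY": {("TAURUS", "OIL")},
    "AVAILABLE": {
        ("PISCES", "FREEPORT"),
        ("GEMINI", "FREEPORT"),
        ("ARIES", "BOSTON"),
        ("DRYDOCK-2", "FREEPORT"),
    },
}
SHIPS = {"TAURUS", "PISCES", "GEMINI", "ARIES"}  # domain class for x


def is_var(term):
    return term.startswith("?")


def search(pattern, relation, bindings):
    """Yield bindings extended by each tuple consistent with the pattern."""
    for row in relation:
        extended = dict(bindings)
        for term, value in zip(pattern, row):
            if is_var(term):
                # A variable bound by an earlier request must keep its value.
                if extended.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield extended


def run_plan(requests, db):
    """Apply the search requests in order, threading bindings through."""
    solutions = [{}]
    for name, pattern in requests:
        solutions = [b2 for b in solutions for b2 in search(pattern, db[name], b)]
    return solutions


PLAN = [
    ("HOME-PORT", ("TAURUS", "?X72")),
    ("CARRY", ("TAURUS", "?X75")),
    ("AVAILABLE", ("?X", "?X72")),
]


def answers(plan=PLAN, db=DB):
    """Values of x that satisfy the plan and the ship type restriction."""
    return sorted({b["?X"] for b in run_plan(plan, db) if b["?X"] in SHIPS})


print(answers())
```

Because X72 is bound to FREEPORT by the HOME-PORT request, the AVAILABLE request only accepts rows whose second column is FREEPORT.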

In Figure 5b, note that the unifications U4 and U5 were computed
when these premises were first entered into the system and
stored in the PCG. Also stored in the PCG were the truth-functional
dependencies within the premises (for example, between DAMAGED and
RETURNS, between RETURNS and OFFLOAD, and between DESTINATION and
TRANSPORT). The unifications U1, U3, and U2 involve query
predicates. Hence they were computed after query input to locate
possible middle-term-chain end points. Once these were found, only the
PCG was used for middle-term chaining.
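Precomputing the premise-to-premise unifications can be sketched as follows. This Python sketch rests on several assumptions: the three premises are rough reconstructions from the proof display in Figure 4 and are not the actual premises 7, 15, and 23; literals are flat (no function symbols or Skolem terms); lower-case names are variables; and the edge list is only a stand-in for the PCG.

```python
def is_var(term):
    return term[0].islower()  # convention of this sketch: lower case = variable


def unify(a, b):
    """Unify two flat literals (predicate, arg, ...); return a substitution or None."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    subst = {}

    def walk(term):
        while term in subst:
            term = subst[term]
        return term

    for s, t in zip(a[1:], b[1:]):
        s, t = walk(s), walk(t)
        if s == t:
            continue
        if is_var(s):
            subst[s] = t
        elif is_var(t):
            subst[t] = s
        else:
            return None  # two different constants clash
    return subst


# Hypothetical premises, with variables already renamed apart.
PREMISES = {
    7: {"if": [("DAMAGED", "s7"), ("HOME-PORT", "s7", "h7")],
        "then": ("RETURNS", "s7", "h7")},
    15: {"if": [("RETURNS", "s15", "h15"), ("CARRY", "s15", "c15")],
         "then": ("OFFLOAD", "s15", "c15", "h15")},
    23: {"if": [("DESTINATION", "s23", "p23", "c23"),
                ("OFFLOAD", "s23", "c23", "h23"),
                ("AVAILABLE", "x23", "h23")],
         "then": ("TRANSPORT", "x23", "c23", "p23")},
}


def build_connections(premises):
    """Once, at premise entry: link each consequent to every antecedent it unifies with."""
    edges = []
    for i, p in premises.items():
        for j, q in premises.items():
            if i == j:
                continue
            for literal in q["if"]:
                subst = unify(p["then"], literal)
                if subst is not None:
                    edges.append((i, j, literal[0], subst))
    return edges


for source, target, predicate, subst in build_connections(PREMISES):
    print(source, "->", target, "on", predicate, subst)
```

Under these assumed premises the stored links are 7 to 15 on RETURNS and 15 to 23 on OFFLOAD, so a query-time chain from DAMAGED to OFFLOAD only needs to unify the query predicates against the chain's end points.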


COMPLETENESS  ISSUES 

The deductive logic on which our system is based is that of an
extensional first-order predicate calculus where the issue of
logical completeness often arises. In our discussion, we will
distinguish between expressional completeness and derivational
completeness.

By expressional completeness is meant the ability to represent,
in our primitive-conditional form, equivalents of all the
well-formed formulas of a first-order predicate calculus. A worry


Figure 5. Inference Plan Development for Query in Figure 4



might arise because only one level of nesting is allowed in
primitive conditionals, i.e., the conjuncts or disjuncts of an
antecedent or of a consequent must be composed of literals (negated or
unnegated predicate occurrences). The worry can be put to rest,
however, when it is recalled that even simpler normal forms are
expressionally complete, for example, the conjunctive normal form.

A conjunctive normal form (CNF) expression is a conjunction of
disjunctions of literals. In our logic, a primitive conditional with
no antecedents is interpreted as unconditionally asserting the
consequent. Thus a CNF disjunction can always be represented as a
primitive conditional with a disjunctive consequent and no
antecedent; and any CNF expression as a conjunction of such
conditionals. Through the use of the inference rules of simplification
($\phi \,\&\, \psi \vdash \phi$) and of adjunction
($\phi, \psi \vdash \phi \,\&\, \psi$), primitive
conditionals may be combined or separated to provide expressional
completeness.
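The CNF argument can be checked mechanically for the propositional case. The Python sketch below is illustrative only: the tuple representation of a primitive conditional, the "~" prefix for negation, and all function names are assumptions made for the example. It maps each CNF disjunction to a conditional with an empty antecedent and then confirms, by exhaustive truth-table comparison, that the conjunction of those conditionals has the same truth value as the original CNF expression.

```python
from itertools import product


def cnf_to_conditionals(cnf):
    """Each CNF disjunction becomes (empty antecedent, disjunctive consequent)."""
    return [((), tuple(clause)) for clause in cnf]


def literal_true(literal, env):
    """Literals are atom names; a leading '~' marks negation."""
    if literal.startswith("~"):
        return not env[literal[1:]]
    return env[literal]


def eval_cnf(cnf, env):
    return all(any(literal_true(l, env) for l in clause) for clause in cnf)


def eval_conditional(conditional, env):
    antecedent, consequent = conditional
    # An empty antecedent is vacuously true, so the consequent is asserted outright.
    antecedent_holds = all(literal_true(l, env) for l in antecedent)
    return (not antecedent_holds) or any(literal_true(l, env) for l in consequent)


def equivalent(cnf, atoms):
    """Compare the CNF with its conditional form under every truth assignment."""
    conditionals = cnf_to_conditionals(cnf)
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if eval_cnf(cnf, env) != all(eval_conditional(c, env) for c in conditionals):
            return False
    return True


example = [["P", "~Q"], ["Q", "R"]]   # (P or not Q) and (Q or R)
print(cnf_to_conditionals(example))
print(equivalent(example, ["P", "Q", "R"]))
```

Separating the list into individual conditionals corresponds to simplification, and collecting them back into one conjunction corresponds to adjunction.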

By derivational completeness is meant the ability to generate
all valid derivations. Our system is derivationally complete in
theory, but the important issue for us has been the system's
practical efficiency and effectiveness in an applications-oriented
environment. That our system is derivationally complete follows
from the fact that it is expressionally complete and handles all
of the deductive interactions associated with unification
(including Skolem functions) as used in resolution systems, as well as
all forms of deductive dependencies that may occur between
predicates (see Klahr [1975] for more detail). The derivational
completeness problem for our system is analogous to the completeness
problem for a resolution system constrained to use a set-of-support
strategy, which has long been known to be derivationally complete
(Wos et al. [1965]). Middle-term chains generated in response to
a query initially involve the desired conclusion (query goals).
Subsequent chains involve subgoals resulting from premises used in
chains to query goals, etc.

In practice, almost any performance-oriented planning strategy,
including ours, will initially apply selection constraints that may
preclude certain deductive interactions from being considered and
thus lead to possible incompleteness. However, successive
relaxation of these selection constraints will enable the system to
reach all possible deductive paths.
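One way to picture successive relaxation is a chain search that starts under a tight bound and loosens it only when no plan is found. The Python sketch below is a hypothetical illustration: the choice of chain length as the selection constraint, the sequence of limits, and the function names are all assumptions, and the dependency graph contains only the three dependencies named in the discussion of Figure 5.

```python
# Truth-functional dependencies named in the text (antecedent -> consequent).
DEPENDENCIES = {
    "DAMAGED": ["RETURNS"],
    "RETURNS": ["OFFLOAD"],
    "DESTINATION": ["TRANSPORT"],
}


def find_chains(graph, start, goal, max_len):
    """Enumerate acyclic chains from start to goal with at most max_len links."""
    chains = []

    def extend(path):
        node = path[-1]
        if node == goal:
            chains.append(list(path))
            return
        if len(path) - 1 >= max_len:
            return  # the current selection constraint cuts this chain off
        for successor in graph.get(node, ()):
            if successor not in path:
                extend(path + [successor])

    extend([start])
    return chains


def plan_with_relaxation(graph, start, goal, limits=(1, 2, 3, 4)):
    """Try the tightest constraint first; relax it only when no chain is found."""
    for limit in limits:
        chains = find_chains(graph, start, goal, limit)
        if chains:
            return limit, chains
    return None, []


print(plan_with_relaxation(DEPENDENCIES, "DAMAGED", "OFFLOAD"))
```

Under a one-link limit the DAMAGED-to-OFFLOAD chain is missed; relaxing the limit to two links recovers it, which is the sense in which an initially incomplete strategy can still reach every path in the end.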

SUMMARY AND FUTURE PLANS

We have described a deductive processor specifically designed to
augment relational data base systems and user-oriented language
processors. The processes of deductive pathfinding, inference planning,
verification, user review of plans, answer extraction, and proof
display have been outlined and illustrated with several examples.
Several of the more important design features that are integral
to this approach are:

• Verification (checking for consistency of variable
substitutions) and instantiation (data-base search) are delayed
until one or more global inference plans have been
constructed.

• Precomputed deductive interactions (unifications) among
premises are used to avoid their constant recomputation during
deductive processing.

• Variable types (domain classes) are used to semantically
restrict the range of predicate expressions.

• Shortest assumption-to-goal deductive paths are found first.

• Inference plans and data-base access strategies are created
from the premise file without requiring access to data-base
values.

• Advice can be given on the use of particular premises and
predicates to aid in the discovery of relevant inference
plans.

The prototype is currently being expanded along several different
dimensions in line with our goal of eventually incorporating
the deductive processor into an operational data management system
and language processor environment. A number of improvements in
man-machine interaction and user displays are being made to support
more direct and flexible control of plan-generation and data-base
search. Additional semantic constraints on the generation of plans
will be introduced by expanded use of the semantic network, and by
extension of the semantic-advice formalism. We also plan additional
investigations in the use of incomplete and plausible knowledge,
and logical properties.